Re: random kernel panic srx300 with 19.1R1.6

January 16, 2020, 11:13 am

≫ Next: Re: Store a prefix-list in a file and reference it instead of storing full prefix-list in configuration?

≪ Previous: Re: Store a prefix-list in a file and reference it instead of storing full prefix-list in configuration?

It's a hardware issue, with as Jonas said, the only solution is a RMA to replace the SRX300

still JTAC keeps the PR hidden as confidential. I suppose a proactive recall of alle SRX300 globally is too expensive.

should be nice to know when the storage will be at its end of its lifetime.

We replaced now 6 srx300 in 3 clusters / locations with 6 RMA's.

really bad for juniper reputation, no proactive attitude at JTAC.

↧

Re: Store a prefix-list in a file and reference it instead of storing full prefix-list in configuration?

January 16, 2020, 12:02 pm

≫ Next: Re: Store a prefix-list in a file and reference it instead of storing full prefix-list in configuration?

≪ Previous: Re: random kernel panic srx300 with 19.1R1.6

You have the option to use "dynamic-address" and define your own feed server where you store your prefixes. This url shows the concept from a Junos Security Director/Policy Enforcer perspective, but can be configured via cli under "security dynamic-address".

https://www.juniper.net/documentation/en_US/release-independent/spotlight-secure/topics/reference/general/secure-connector-dynamic-address-group-overview.html

You can also buy a "threat feeds only" subscription for your SRX running at least 15.1X49 or newer where GeoIP is included and you can the allow/reject per country code. Example configuration found here: https://www.juniper.net/documentation/en_US/release-independent/jatp/topics/task/configuration/jatp-srx-integration-geoip-commands.html

↧

Re: Store a prefix-list in a file and reference it instead of storing full prefix-list in configuration?

January 16, 2020, 5:56 pm

≫ Next: Re: SRX chassis cluster - DHCP server does not work

≪ Previous: Re: Store a prefix-list in a file and reference it instead of storing full prefix-list in configuration?

It looks like dynamic-address groups look like a possible option. Thanks for pointing to this feature.

Is there any public docs on how to interface with this to build a feed server? I figured I could just write my own if the protocol i s simple enough. Any details on the format, TCP/UDP ports, authentication,etc. etc?

Thanks,

-J

↧

Re: SRX chassis cluster - DHCP server does not work

January 17, 2020, 12:31 am

≫ Next: BGP with 2 upstreams and default route only

≪ Previous: Re: Store a prefix-list in a file and reference it instead of storing full prefix-list in configuration?

Thank you for all your replies. Bofh were helpful.

The "primary" parameter on IP address resolved the issue.

↧

BGP with 2 upstreams and default route only

January 17, 2020, 3:24 am

≫ Next: Betreff: BGP with 2 upstreams and default route only

≪ Previous: Re: SRX chassis cluster - DHCP server does not work

Hello,

I configured 2 bgp sessions to receive only 0.0.0.0 from upstreams, so the traffic goes only via 1 upstream (probably the older session). It looks like this:

admin@SRX1# run show route 0.0.0.0    

inet.0: 165304 destinations, 165307 routes (19 active, 0 holddown, 165285 hidden)
Restart Complete
+ = Active Route, - = Last Active, * = Both

0.0.0.0/0          *[BGP/170] 01:40:02, localpref 100
                      AS path: 1234 I, validation-state: unverified
                    > to 1.1.1.1 via ge-5/0/15.0
                    [BGP/170] 00:00:30, localpref 100
                      AS path: 6789 ?, validation-state: unverified> to 2.2.2.2 via ge-0/0/12.0

My current bgp configuration:

admin@SRX1# show protocols bgp 
group bgp-isp {
    type external;
    import import-default-route;
    export send-my-prefix;
    neighbor 1.1.1.1 {
        description isp1-bgp;
        peer-as 1234;
    }
    neighbor 2.2.2.2 {
        description isp2-bgp;
        peer-as 6789;
    }
}

I tried to add local-pref in neighbor configuration, but without success. I also tried to apply local-pref to import policy, but without success. Always in "show route" I see that local pref didn't change.

What would be the proper way to to this?

↧

Betreff: BGP with 2 upstreams and default route only

January 17, 2020, 4:44 am

≫ Next: Re: BGP with 2 upstreams and default route only

≪ Previous: BGP with 2 upstreams and default route only

Hello Gabriel,

you can configure the Local Preference the following way, if you want to prefer neighbor 1.1.1.1 for egress traffic:

-------------

policy-options {
  policy-statement import-default-route {
    term 1 {
          from {
             route-filter 0.0.0.0/0 exact;
               }
        then {
            local-preference 110;
            accept;
        }
    }
  }
}

group bgp-isp {
    type external;
    export send-my-prefix;
    neighbor 1.1.1.1 {
        description isp1-bgp;
        import import-default-route;
        peer-as 1234;
    }
    neighbor 2.2.2.2 {
        description isp2-bgp;
        peer-as 6789;
    }
}

HTH

↧

Re: BGP with 2 upstreams and default route only

January 17, 2020, 6:31 am

≫ Next: Weird issue when disabling an interface (ARP related?)

≪ Previous: Betreff: BGP with 2 upstreams and default route only

Hi,

Not sure what is your question? Do you want traffic to load balance between 2 ISP links?

If so, you need to do a few things:

1. In routing-table, it looks correct as you have two candicate routes

2. In forwarding-table, if you need to have 2 next-hops, you need to have load-balance per-packet policy under "routing-options forwarding-table"

3. In your case, you might also need bgp multipath multiple-as

↧

Weird issue when disabling an interface (ARP related?)

January 17, 2020, 8:21 am

≫ Next: Re: Weird issue when disabling an interface (ARP related?)

≪ Previous: Re: BGP with 2 upstreams and default route only

Hi guys,

I'm facing an issue on a SRX300 cluster. It's configuration is quite simple , as it's just used to keep a VPN tunnel open and route some traffic though it. Originally there were one more VPN tunnel, not in use anymore.

We had a reth interface with a public IP set for one of the tunnels (say TUNNEL-A) and another reth interface with a public IP set for the other tunnel (both IP's were within the same range). e.g.:

reth1 {
        description IF_TUNNEL-A;
        redundant-ether-options {
            redundancy-group 1;
        }
        unit 0 {
            family inet {
                address 80.80.80.100/24;
            }
        }
    }
    reth2 {
        description IF_TUNNEL-B;
        redundant-ether-options {
            redundancy-group 1;
        }
        unit 0 {
            family inet {
                address 80.80.80.101/24 {
                                    }
            }
        }

Say default gw is 80.80.80.1

The thing is that I want to disable the interface reth1. When I commit, the VPN tunnel goes down, and no internet traffic at all.

I thought that it may be related to arp entries of the IP of the default gateway (80.80.80.1) associated to the interface I want to disable, so I tried clearing arp entries and setting them as static entries under reth2. But when I tried to disable reth1, same issue occurred.

Any clue?

Thanks

↧

Re: Weird issue when disabling an interface (ARP related?)

January 17, 2020, 8:22 am

≫ Next: Re: BGP with 2 upstreams and default route only

≪ Previous: Weird issue when disabling an interface (ARP related?)

BTW I tried also to remove the IP of the IF, same result

↧

Re: BGP with 2 upstreams and default route only

January 17, 2020, 1:57 pm

≫ Next: Re: Weird issue when disabling an interface (ARP related?)

≪ Previous: Re: Weird issue when disabling an interface (ARP related?)

Hi Gabriel,

Greetings, you have two options to easily accomplish this task. You can create an import policy to modify the rip-in table before it reaches/installed in the rip local and set the local preference to something higher than 100 or you can create an export policy set the local preference to something higher than 100
before exporting routes to your IBGP peers.

Can you please share how and where did you configure the local preference statement or the policy?

Regards,
Lil Dexx JNCIE-ENT#863

If this solves your problem, please mark this post as "Accepted Solution" so we can help others too.
If you consider that my input was helpful, giving me a kudos would make my daySmiley Happy

↧

Re: Weird issue when disabling an interface (ARP related?)

January 17, 2020, 4:14 pm

≫ Next: Local Web Filtering

≪ Previous: Re: BGP with 2 upstreams and default route only

How is your default route configured on the cluster?

And is that route still active after the changes?

Which ip address is the gateway for the vpn that goes down?

↧

Local Web Filtering

January 18, 2020, 2:04 am

≫ Next: Re: Local Web Filtering

≪ Previous: Re: Weird issue when disabling an interface (ARP related?)

Does anyone have the 'set' commands to create a blocked rule for web sites local web filtering on and SRX.

I have over 8000 websites to block. As an example lets say I want to block www.hackthissite.org?

Thanks

↧

Re: Local Web Filtering

January 18, 2020, 10:36 am

≫ Next: Help identifying cause of millions of fragmented packets and loss of connection

≪ Previous: Local Web Filtering

Something like this should get you started:

set security utm custom-objects url-pattern Denied-websites value http://www.hackthissite.org
set security utm custom-objects url-pattern Denies-websites value http://*.hackanothersite.org
set security utm custom-objects custom-url-category BLACKLISTED value Denied-websites
set security utm feature-profile web-filtering url-blacklist BLACKLISTED
set security utm feature-profile web-filtering type juniper-local
set security utm feature-profile web-filtering juniper-local profile UTM-local default permit
set security utm feature-profile web-filtering juniper-local profile UTM-local custom-block-message "Access blocked"
set security utm feature-profile web-filtering juniper-local profile UTM-local fallback-settings default block
set security utm feature-profile web-filtering juniper-local profile UTM-local fallback-settings too-many-requests block
set security utm utm-policy UTM-local web-filtering http-profile UTM-local

set security policies from-zone Trust to-zone Untrust policy block-websites match source-address any
set security policies from-zone Trust to-zone Untrust policy block-websites match destination-address any
set security policies from-zone Trust to-zone Untrust policy block-websites match application any
set security policies from-zone Trust to-zone Untrust policy block-websites then permit application-services utm-policy UTM-local
set security policies from-zone Trust to-zone Untrust policy block-websites then log session-init
set security policies from-zone Trust to-zone Untrust policy block-websites then log session-close

Note #1: specific set commands are not tested so typos can be present.

Note #2: This kind of blocking doesn't require a license.

↧

Help identifying cause of millions of fragmented packets and loss of connection

January 18, 2020, 7:47 pm

≫ Next: About replacement of ScreenOS commands with JunosOS commands

≪ Previous: Re: Local Web Filtering

Hello Juniper community! I'm new to Juniper, and I'm at a PreK-12 school with about 800 students and 100 faculty. The last couple weeks we've experienced frequent events where we lose Internet connection for about 10 minutes, these happen between 2-10 times a day. These events have been going on a couple times a week all school year, but the TP-link TL-ER5120 we had been using would only max out CPU and so I had no visibility into what was going on. We replaced that with SRX210 on January 7, but that was also maxing out CPU, sessions per second, and max sessions, so we upgraded to an SRX340 [15.1X49-D150.2] on January 16. Now I finally have some insight into what might be happening, but I need some help identifying a root cause.

During one of these events, a typical output for show security flow session summary looks like this:

inet-fw1> show security flow session summary
Unicast-sessions: 26170
Multicast-sessions: 0
Failed-sessions: 125334
Sessions-in-use: 46416
  Valid sessions: 26253
  Pending sessions: 0
  Invalidated sessions: 20163
  Sessions in other states: 0
Maximum-sessions: 262144

Which shows tens of thousands of invalidated sessions and hundreds of thousands of failed sessions.

For reference, this is what normal use looked like a few minutes before:

inet-fw1> show security flow session summary
Unicast-sessions: 23597
Multicast-sessions: 0
Failed-sessions: 0
Sessions-in-use: 23822
  Valid sessions: 23590
  Pending sessions: 0
  Invalidated sessions: 232
  Sessions in other states: 0
Maximum-sessions: 262144

inet-fw1> show security flow session nat summary
Valid sessions: 23557
Pending sessions: 0
Invalidated sessions: 180
Sessions in other states: 0
Total sessions: 23737

After each event I cleared security flow statistics to reset the counters to 0, and after an event show security flow statistics looked like this:

inet-fw1> show security flow statistics
    Current sessions: 19784
    Packets forwarded: 5265663
    Packets dropped: 3270801
    Fragment packets: 3322634
    Pre fragments generated: 0
    Post fragments generated: 0

Fragment packets are usually 0 or single digit numbers until one of these events start, and after the SRX has recovered and the fragment packets stops incrementing, I can take the number of fragmented packets and divide by the number of minutes and come up with a number around 20k packets per second. This seems roughly equivalent to the pps on our ge-0/0/0 interface to the Internet using monitor interface traffic, which means nearly every packet is being fragmented for a period of 5-10 minutes. According to our ISP, events like this correspond with a spike in our bandwidth utilization saturating our 200Mbps fiber connection. However, I don't know which is the cause and which is the effect. That is, are we over-utilizing our fiber connection and so packets are fragmenting which causes the session drop, or are events like this causing all the sessions to re-establish all at once, which maxes out our bandwidth? FWIW, here is a graph from our ISP of this week with red dots indicating events for which I recorded the times: bandwidth utilization graph.png

I can definitely report that not all periods of maxed bandwidth result in these events, so I'm not convinced that is the cause. However, I don't see evidence (via show security flow statistics) that these occur overnight or on weekends when there aren't users on campus.

CPU usage does not seem like an issue, and show chassis routing-engine never shows anything concerning. However, we're probably reaching the SRX340 session-creation-per-second limit during these times:

inet-fw1> show security monitoring performance spu
.fpc  0  pic  0
Last 60 seconds:
 0:  34   1:  28   2:  34   3:  28   4:  35   5:  34
 6:  29   7:  34   8:  27   9:  35  10:  29  11:  35
12:  28  13:  34  14:  28  15:  34  16:  27  17:  35
18:  34  19:  31  20:  35  21:  28  22:  34  23:  28
24:  34  25:  28  26:  35  27:  29  28:  34  29:  28
30:  34  31:  34  32:  28  33:  34  34:  28  35:  35
36:  28  37:  33  38:  27  39:  34  40:  28  41:  34
42:  28  43:  34  44:  32  45:  32  46:  34  47:  28
48:  34  49:  28  50:  35  51:  27  52:  34  53:  27
54:  34  55:  28  56:  33  57:  26  58:  33  59:  34

inet-fw1> show security monitoring fpc 0
FPC 0
  PIC 0
    CPU utilization          :   28 %
    Memory utilization       :   50 %
    Current flow session     : 52459
    Current flow session IPv4: 52459
    Current flow session IPv6:    0
    Max flow session         : 262144
Total Session Creation Per Second (for last 96 seconds on average): 7857
IPv4  Session Creation Per Second (for last 96 seconds on average): 7857
IPv6  Session Creation Per Second (for last 96 seconds on average):    0

I have typical screens set up for both trust and untrust interfaces, and the only non-zero values are the following:

inet-fw1> show security screen statistics zone trust
  TCP SYN flood                              9820
      SYN flood source                       9820
      SYN flood destination                  0
  IP spoofing                                3532
  TCP FIN no ACK                             1
  IP block fragment                          196

inet-fw1> show security screen statistics zone untrust
  IP tear drop                               88

I'm not ruling out attacks (either from the outside or from knucklehead students on the inside), but I'm not sure if these are causes or symptoms. I don't know how the counter increments (for a legitimate tear drop attack, should the counter increment for each packet?), but it seems like in 3322634 fragmented packets, it's likely that 88 of them could be identified as a teardrop attack whether it is one or not. Also when establishing so many sessions in a short amount of time, it seems likely that the firewall would perceive that as a SYN flood.

I've played around with values for security screen ids-option trust-screen limit-session source-ip-based to try and limit the sessions that are being under normal circumstances to try and smooth out any spikes. 100 doesn't seem to prevent the events, 20 seems like we're back in dial-up times so I didn't leave it there long enough to test, and 50 seems restricting to end users but still doesn't prevent these events from happening.

Pings from the shell to an IP address a couple hops up the ISP chain time out for most of the event, and the only thing in the logs that seems to correspond with one of these events is:

TOPO_CH: for Instance 0 in  routing-instance default received on port ge-0/0/1.0

ge-0/0/1.0 is my trust interface, but again, I'm not sure if this is a cause or symptom.

Here is the output for show system statistics tcp for the first 48 hours this device was in production (11am Thursday to 11am Saturday):

Tcp:
         1090164 packets sent
                 288028 data packets (44316564 bytes)
                 936 data packets retransmitted (888201 bytes)
                 0 resends initiated by MTU discovery
                 751285 ack only packets (2199 packets delayed)
                 0 URG only packets
                 0 window probe packets
                 10 window update packets
                 100184 control packets
         1872506 packets received
                 262132 acks(for 44312689 bytes)
                 1494766 duplicate acks
                 0 acks for unsent data
                 19204  packets received in-sequence(2582789 bytes)
                 747406 completely duplicate packets(55306 bytes)
                 29 old duplicate packets
                 53 packets with some duplicate data(14654 bytes duped)
                 264 out-of-order packets(19040 bytes)
                 0 packets of data after window(0 bytes)
                 0 window probes
                 202 window update packets
                 7 packets received after close
                 23 discarded for bad checksums
                 0 discarded for bad header offset fields
                 0 discarded because packet too short
         49791 connection requests
         625 connection accepts
         9 bad connection attempts
         0 listen queue overflows
         646 connections established (including accepts)
         51211 connections closed (including 38 drops)
                 14 connections updated cached RTT on close
                 14 connections updated cached RTT variance on close
                 3 connections updated cached ssthresh on close
         49761 embryonic connections dropped
         260991 segments updated rtt(of 308477 attempts)
         605 retransmit timeouts
                 18 connections dropped by retransmit timeout
         0 persist timeouts
                 0 connections dropped by persist timeout
         746940 keepalive timeouts
                 746933 keepalive probes sent
                 7 connections dropped by keepalive
         59440 correct ACK header predictions
         9342 correct data packet header predictions
         666 syncache entries added
                 63 retransmitted
                 38 dupsyn
                 0 dropped
                 625 completed
                 0 bucket overflow
                 0 cache overflow
                 32 reset
                 9 stale
                 0 aborted
                 0 badack
                 0 unreach
                 0 zone failures
         0 cookies sent
         0 cookies received
         4 SACK recovery episodes
         3 segment retransmits in SACK recovery episodes
         1429 byte retransmits in SACK recovery episodes
         119 SACK options (SACK blocks) received
         23 SACK options (SACK blocks) sent
         0 SACK scoreboard overflow
         0 ACKs sent in response to in-window but not exact RSTs
         0 ACKs sent in response to in-window SYNs on established connections
         0 rcv packets dropped by TCP due to bad address
         0 out-of-sequence segment drops due to insufficient memory
         49804 RST packets
         6 ICMP packets ignored by TCP
         0 send packets dropped by TCP due to auth errors
         0 rcv packets dropped by TCP due to auth errors
         0 outgoing segments dropped due to policing

And here's another useful output

inet-fw1> show interfaces ge-0/0/0 extensive | find "Flow error statistics"
    Flow error statistics (Packets dropped due to):
      Address spoofing:                  0
      Authentication failed:             0
      Incoming NAT errors:               0
      Invalid zone received packet:      0
      Multiple user authentications:     0
      Multiple incoming NAT:             0
      No parent for a gate:              0
      No one interested in self packets: 0
      No minor session:                  0
      No more sessions:                  0
      No NAT gate:                       0
      No route present:                  0
      No SA for incoming SPI:            0
      No tunnel found:                   0
      No session for a gate:             0
      No zone or NULL zone binding       0
      Policy denied:                     19312
      Security association not active:   0
      TCP sequence number out of window: 122
      Syn-attack protection:             0
      User authentication errors:        0
    Protocol inet, MTU: 1500, Generation: 153, Route table: 0
      Flags: Sendbcast-pkt-to-re, Is-Primary
      Addresses, Flags: Is-Default Is-Preferred Is-Primary

We have 2 lightly-used web servers in our dmz (that interface shows 15.4MB, 34k packets input and 3.9MB, 30k packets output over 48 hours), but I unplugged that cable anyway to rule them out as a cause and we still experienced events, so I don't think they are to blame.

We also have an inline cachebox and content filter between the SRX340 and our Cisco WS-C3850-12S core switch, and still experienced the issue when they were bypassed so I think I can rule them out as well.

I added the following to security flow, but the security-trace log file is still 0 bytes, so I'm not sure what I need to do to get that working since I think that will be very helpful:

traceoptions {
    file size 200k files 5 world-readable;
    flag fragmentation;
    flag session;
    flag tcp-basic;
    rate-limit 1000;
}

I'm hoping that someone can help me use the tools this Juniper box has to understand what might be causing these events since I'm still learning. These inturruptions have been very disruptive for the last 2 weeks since school resumed, and I'm hoping with a long weekend I might be able to figure something out, but I'm out of ideas.

Thanks in advance

↧

About replacement of ScreenOS commands with JunosOS commands

January 19, 2020, 6:06 pm

≫ Next: the process "pagezero" is so high in SRX650

≪ Previous: Help identifying cause of millions of fragmented packets and loss of connection

How are the following commands in ScreenOS replaced in JunosOS @SRX?

===
set interface eth0/0 manage mtrace
===

Are the following commands suitable?

===
set security zones security-zone trust host-inbound-traffic system-services traceroute
===

Regards,

↧

the process "pagezero" is so high in SRX650

January 19, 2020, 10:27 pm

≫ Next: Re: Weird issue when disabling an interface (ARP related?)

≪ Previous: About replacement of ScreenOS commands with JunosOS commands

Hi, Juniper experts,

Since last week, we found the system processes "pagezero" and "flowd_octeon_hm" were so high that caused CPU utilization (CPU Kernal) was so high as below:

root@SRX650-A> show version
node0:
--------------------------------------------------------------------------
Hostname: SRX650-A
Model: srx650
JUNOS Software Release [12.3X48-D40.5]

node1:
--------------------------------------------------------------------------
Hostname: SRX650-B
Model: srx650
JUNOS Software Release [12.3X48-D40.5]

{primary:node0}
root@SRX650-A>

root@SRX650-A> show chassis routing-engine
node0:
--------------------------------------------------------------------------
Routing Engine status:
Temperature 35 degrees C / 95 degrees F
CPU temperature 35 degrees C / 95 degrees F
Total memory 2048 MB Max 1495 MB used ( 73 percent)
Control plane memory 1072 MB Max 697 MB used ( 65 percent)
Data plane memory 976 MB Max 800 MB used ( 82 percent)
CPU utilization:
User 59 percent
Background 0 percent
Kernel 41 percent
Interrupt 0 percent
Idle 0 percent
Model RE-SRXSME-SRE6
Serial ID ACNK2809
Start time 2018-06-26 16:36:36 GMT+4
Uptime 572 days, 9 hours, 41 minutes, 23 seconds
Last reboot reason 0x200:normal shutdown
Load averages: 1 minute 5 minute 15 minute
2.30 2.29 2.25

node1:
--------------------------------------------------------------------------
Routing Engine status:
Temperature 35 degrees C / 95 degrees F
CPU temperature 35 degrees C / 95 degrees F
Total memory 2048 MB Max 1331 MB used ( 65 percent)
Control plane memory 1072 MB Max 536 MB used ( 50 percent)
Data plane memory 976 MB Max 800 MB used ( 82 percent)
CPU utilization:
User 8 percent
Background 0 percent
Kernel 2 percent
Interrupt 0 percent
Idle 91 percent
Model RE-SRXSME-SRE6
Serial ID ACNK2663
Start time 2018-06-26 16:21:42 GMT+4
Uptime 572 days, 9 hours, 56 minutes, 15 seconds
Last reboot reason 0x200:normal shutdown
Load averages: 1 minute 5 minute 15 minute
0.04 0.06 0.07

{primary:node0}
root@SRX650-A>

root@SRX650-A> show system memory
node0:
--------------------------------------------------------------------------
System memory usage distribution:
Total memory: 2097152 Kbytes (100%)
Reserved memory: 1061628 Kbytes ( 50%)
Wired memory: 1122500 Kbytes ( 53%)
Active memory: 310960 Kbytes ( 14%)
Inactive memory: 197376 Kbytes ( 9%)
Cache memory: 229416 Kbytes ( 10%)
Free memory: 160560 Kbytes ( 7%)
Memory disk resident memory: 60348 Kbytes
VM-Kbytes( % ) Resident( % ) Map-name
619676(59.09) 169880(00.00) kernel
Pid VM-Kbytes( % ) Resident( % ) Process-name
98841 6768(01.29) 1504(00.00) /sbin/pma

root@SRX650-A> show system processes summary
node0:
--------------------------------------------------------------------------
last pid: 98838; load averages: 2.18, 2.26, 2.24 up 572+09:40:25 02:16:30
177 processes: 29 running, 136 sleeping, 1 zombie, 11 waiting

Mem: 304M Active, 193M Inact, 1096M Wired, 224M Cache, 112M Buf, 156M Free
Swap:

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
38 root 1 171 52 0K 16K RUN 0 36:32 6709.77% pagezero
1421 root 15 139 0 1026M 80380K CPU11 b ??? 1011.87% flowd_octeon_hm
1092 root 1 139 0 15104K 5180K RUN 0 50.7H 33.06% eventd
89802 root 1 139 0 46944K 26340K RUN 0 34.3H 30.81% mgd

node1:
--------------------------------------------------------------------------
last pid: 53464; load averages: 0.03, 0.06, 0.07 up 572+09:55:18 02:16:30
128 processes: 24 running, 91 sleeping, 1 zombie, 12 waiting

Mem: 174M Active, 167M Inact, 1096M Wired, 165M Cache, 112M Buf, 369M Free
Swap:

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
1419 root 15 139 0 1026M 80380K CPU11 b ??? 1029.00% flowd_octeon_hm
22 root 1 171 52 0K 16K RUN 0 ??? 81.20% idle: cpu0

{primary:node0}
root@SRX650-A>

Is the memory size issue ?

Any advice/suggestion ?

Many thanks in advance.

↧

Re: Weird issue when disabling an interface (ARP related?)

January 20, 2020, 8:11 am

≫ Next: Re: (No) traffic through Dynamic VPN. Sometimes

≪ Previous: the process "pagezero" is so high in SRX650

Hi Steve,

How is your default route configured on the cluster?

- There are bassically 2 routes, the one to the internet router IP (80.80.80.1) and the one to the private IP space of the other end of the tunnel (st0.2):

route 0.0.0.0/0 next-hop 80.80.80.1;
route 10.70.0.0/23 next-hop st0.1

And is that route still active after the changes?

- The routing configuration hasn't been changed.

Which ip address is the gateway for the vpn that goes down?

- 80.80.80.101

See the IKE config for that tunnel, the production one, the only that is active:

gateway PROD_VPN_gateway {
ike-policy policy-PROD-VPN;
address 222.222.222.222;
no-nat-traversal;
local-identity distinguished-name;
external-interface reth2.0;
general-ikeid;

BTW today I tried to change the IP of the reth1 to a fake one (just in case the issue may be related to the fact that reth1 and reth2 have IP's of the same subnet), and evything went fine after the change... but when I connected to the network the new firewall with the old IP configured (80.80.80.100), the tunnel crashed again.

Thanks!

↧

Re: (No) traffic through Dynamic VPN. Sometimes

January 20, 2020, 12:36 pm

≫ Next: Re: the process "pagezero" is so high in SRX650

≪ Previous: Re: Weird issue when disabling an interface (ARP related?)

After applying the changes from this doc my testing now has a 100% success rate.

https://kb.juniper.net/InfoCenter/index?page=content&id=TSB17441&act=login

That link was added to the TSB17441 page in December incase anybody missed it.

↧

Re: the process "pagezero" is so high in SRX650

January 20, 2020, 6:04 pm

≫ Next: Re: the process "pagezero" is so high in SRX650

≪ Previous: Re: (No) traffic through Dynamic VPN. Sometimes

Hi Ben,

This process is responsible for packet handling, data processing, or flow processing. The flow processing is all done on the data plane. Flowd_octeon being high is a consequence and not a cause, when other processes such as eventd,snmpd,httpd go high.

In other words, you will need to find out why those processes are high and resolve this issue first.

Please check this article, this will help you understand and troubleshoot the problem: https://www.tunnelsup.com/troubleshooting-high-cpu-on-juniper-srx-junos-devices/

Regards,
Lil Dexx JNCIE-ENT#863

If this solves your problem, please mark this post as "Accepted Solution" so we can help others too \Smiley Happy/

↧

Re: the process "pagezero" is so high in SRX650

January 20, 2020, 6:11 pm

≫ Next: Re: About replacement of ScreenOS commands with JunosOS commands

≪ Previous: Re: the process "pagezero" is so high in SRX650

Hi, LilDux,

We found the root cause, it is due to some services/daemons are activated and locked in the kernal, so the "pagezero" process is high but can not be identied.

The CPU high was solved after the services are killed....for our case the service is "netconf".

Thanks so much for your kind help indeed.

Cheers

Benson

↧